Abstract: We study how the behavior of viral spreading processes
is influenced by local structural properties of the network
over which they propagate. For a wide variety of spreading 
processes, the largest eigenvalue of the adjacency matrix of the 
network plays a key role in their global dynamical behavior.
For many real-world large-scale networks, it is infeasible to
exactly retrieve the complete network structure to compute its
largest eigenvalue. Instead, one usually has access to myopic,
egocentric views of the network structure, also called egonets. 
In this paper, we propose a mathematical framework, based 
on algebraic graph theory and convex optimization, to study 
how local structural properties of the network constrain the 
interval of possible values in which the largest eigenvalue must 
lie. Based on this framework, we present a computationally 
efficient approach to find this interval from a collection of 
egonets. Our numerical simulations show that, for several social 
and communication networks, local structural properties of the 
network strongly constrain the location of the largest eigenvalue 
and the resulting spreading dynamics. From a practical point of 
view, our results can be used to dictate immunization strategies 
to tame the spreading of a virus, or to design network topologies 
that facilitate the spreading of information virally. 

Index Terms: Complex Networks, Virus Spreading, Algebraic
Graph Theory, Convex Optimization. 



I. Introduction 

Understanding the behavior of viral spreading processes 
taking place in large complex networks is of critical interest in 
mathematical epidemiology [1], [2]. Spreading processes are
relevant in many real scenarios, such as disease spreading in
human populations [3]-[5], malware propagation in computer
networks [6]-[7], or information dissemination in online social
networks [8]-[9]. To study viral spreading processes, a variety
of stochastic dynamical models has been proposed in the
literature [10]-[14]. In these models, the steady-state infection
of the network presents two different regimes depending on 
the virulence of the infection and the structure of the network 
of contacts. In one of the regimes, an initial infection dies 
out at a fast (usually exponential) rate. In the other regime, 
an initial infection becomes an epidemic. Both numerical and 
analytical results show that these two regimes are separated 
by a phase transition at an epidemic threshold determined
by both the virulence of the infection and the topology
of the network. One of the most fundamental questions in 
mathematical epidemiology is to find the value of the epidemic 
threshold in terms of the virus model and the contact network. 

In many cases of practical interest it is infeasible to exactly
retrieve the complete structure of a network of contacts. In
these cases, it is impossible to exactly compute the epidemic
threshold. On the other hand, in most cases one can easily
retrieve the structure of egocentric views of the network,
also called egonets.^1 To estimate the value of the epidemic
threshold, researchers have proposed a variety of random
network models in which they can prescribe structural properties
that can be retrieved from these egonets, such as the
degree distribution [15], [16], local correlations [17], [18], or
clustering [19].

Although random networks are the primary tool to study the 
impact of local structural features on the epidemic threshold 
[20], this approach presents a major flaw: Random network
models implicitly induce many structural properties that are 
not directly controlled but can have a strong influence on the 
value of the epidemic threshold. For example, it is possible 
to find two networks having the same degree distribution, but 
with opposite dynamical behavior [21]. Therefore, it is difficult
(if not impossible) to isolate the role of a particular structural 
property in the network performance using random network 
models. Furthermore, many real networks present weighted 
edges representing, for example, bandwidth in communication 
networks or resistance in electric networks. Current random 
networks fail to faithfully recover both the structure of the 
network and the distribution of weights over the links. In 
this paper, we develop a mathematical framework, based on 
algebraic graph theory and convex optimization, to study how 
the structure of local egonets constrains the interval of possible
values in which the epidemic threshold must lie. As a result of 
our analysis, we present a computationally efficient approach 
to find this interval from a collection of egonets extracted from 
a (possibly) weighted network. Our numerical simulations 
show that the resulting interval is very narrow for several social 
and communication networks. This illustrates the fact that, for 
many real networks, local structural properties of the network 
strongly constrain the location of the viral epidemic threshold. 

The rest of this paper is organized as follows. In Section II
we review terminology and existing results relating the dynamical
behavior of a virus model with spectral properties of the
network of contacts. In Section III we introduce an approach
based on algebraic graph theory and convex optimization
to find upper and lower bounds on the epidemic threshold
from local egonets. In Subsection III-A we introduce an
approach to relate these egonets to the so-called spectral
moments of the adjacency matrix. In Subsection III-B we
propose an optimization framework to derive bounds on the
epidemic threshold from a collection of spectral moments.
In Section IV we illustrate the quality of our approach by
computing bounds on the epidemic threshold for real-world
social and communication networks.

1 A rigorous definition of egonet, in graph-theoretical terms, will be given
in Section III.



II. Notation & Preliminaries 

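Let $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ be an undirected, unweighted graph, where
$\mathcal{V} = \{1, \ldots, n\}$ denotes a set of $n$ nodes and
$\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ denotes a set of undirected edges linking them.
If $(i,j) \in \mathcal{E}$, we call nodes $i$ and $j$ adjacent (or first-neighbors), which
we denote by $i \sim j$. We define the set of first-neighbors
of a node $i$ as $\mathcal{N}_i = \{j \in \mathcal{V} : (i,j) \in \mathcal{E}\}$. The degree
$d_i$ of a vertex $i$ is the number of nodes adjacent to it, i.e.,
$d_i = |\mathcal{N}_i|$. A graph is weighted if there is a real number
$w_{ij}$ associated with every edge $(i,j) \in \mathcal{E}$. More
formally, a weighted graph $\mathcal{H}$ can be defined as the triad
$\mathcal{H} = (\mathcal{V}, \mathcal{E}, \mathcal{W})$, where $\mathcal{V}$ and $\mathcal{E}$ are the sets of nodes and edges
in $\mathcal{H}$, and $\mathcal{W} = \{w_{ij} \in \mathbb{R} \setminus \{0\}, \text{ for all } (i,j) \in \mathcal{E}\}$ is the set
of (possibly negative) weights.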

The adjacency matrix of a simple graph $\mathcal{G}$, denoted by
$A_{\mathcal{G}} = [a_{ij}]$, is an $n \times n$ symmetric matrix defined entry-wise
as $a_{ij} = 1$ if nodes $i$ and $j$ are adjacent, and $a_{ij} = 0$
otherwise. For weighted graphs, the entry $a_{ij}$ is equal to
the weight $w_{ij}$ for $(i,j) \in \mathcal{E}$, and 0 otherwise. For undirected
graphs, $A_{\mathcal{G}}$ is a symmetric matrix; thus, $A_{\mathcal{G}}$ has a full set
of $n$ real and orthogonal eigenvectors with real eigenvalues
$\lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_n$. The largest eigenvalue of $A_{\mathcal{G}}$, $\lambda_1$,
is called the spectral radius of $A_{\mathcal{G}}$. If $A_{\mathcal{G}}$ has nonnegative
entries and is irreducible (i.e., $\mathcal{G}$ is connected), then the
Perron-Frobenius theorem [22] can be used to show that the spectral
radius $\lambda_1$ is unique, real, and positive. We also define the $k$-th
spectral moment of $\mathcal{G}$ as:

$$ m_k(A_{\mathcal{G}}) = \frac{1}{n} \sum_{i=1}^{n} \lambda_i^k. \tag{1} $$

A walk of length $k$ from node $i_1$ to node $i_{k+1}$ is an ordered
sequence of nodes $(i_1, i_2, \ldots, i_{k+1})$ such that $i_j \sim i_{j+1}$ for
$j = 1, 2, \ldots, k$. One says that the walk touches each of the
nodes that comprises it. If $i_1 = i_{k+1}$, then the walk is closed.
A closed walk with no repeated nodes (with the exception
of the first and last nodes) is called a cycle. Given a walk
$p = (i_1, i_2, \ldots, i_{k+1})$ in a weighted graph $\mathcal{H}$, we define the
weight of the walk as

$$ \omega(p) = w_{i_1 i_2} w_{i_2 i_3} \cdots w_{i_k i_{k+1}}. $$

A. Stochastic Modeling of Viral Spreading 

A wide variety of stochastic models has been proposed
in the literature to study the dynamics of virus spreading
processes. In most models, the steady-state level of infection
in the network presents two different regimes separated by a
phase transition taking place at an epidemic threshold. This
epidemic threshold is determined by both the virulence of
the infection and the network topology. A series of papers
studies the value of this epidemic threshold as a function of
the network structure, in both random [23]-[27] and real
topologies [10]-[14]. A spreading model widely considered
in the literature is the so-called SIS (Susceptible-Infected-Susceptible)
model. In this model, each individual in the
network can be in one of two possible states: susceptible
or infected. Given an initial set of infected individuals, the
virus propagates through the edges of an undirected graph $\mathcal{G}$
at an infection rate $\beta$. Simultaneously, infected nodes recover
at a rate $\delta$, returning to the susceptible state (see [10]
for a formal description of this model). In [10]-[12], we
find different (and complementary) approaches to find an
expression for the SIS epidemic threshold. In all of these
papers, the authors are able to decouple the effect of the
network topology from the dynamics of individual nodes. On
the one hand, the effect of the node dynamics is completely
characterized by the ratio $\tau_{SIS} = \delta/\beta$. On the other hand,
the effect of the network topology depends exclusively on the
largest eigenvalue of the network adjacency matrix, $\lambda_1(A_{\mathcal{G}})$,
such that if the threshold condition $\lambda_1(A_{\mathcal{G}}) < \tau_{SIS} = \delta/\beta$ is
satisfied, a 'small' initial infection dies out exponentially fast
[10]-[12].

Many extensions to the SIS model have been proposed to
capture different characteristics of viral processes, such as
permanent or temporary immunity of a recovered individual,
or virus incubation time [13], [14]. As shown in [14], the
decoupling argument that allows one to separate the role of the
network topology from the node dynamics in the SIS model
still holds for a variety of other virus models. Similarly, a
'small' initial infection dies out exponentially fast in these
models if the condition $\lambda_1(A_{\mathcal{G}}) < \tau_{VM}$ is satisfied, where
the threshold $\tau_{VM}$ measures the virulence of the infection
(and is independent of the network structure). In short,
all of the above results highlight the key role played by
the largest eigenvalue of the adjacency matrix, $\lambda_1(A_{\mathcal{G}})$, in
virus spreading processes. In particular, the larger $\lambda_1(A_{\mathcal{G}})$,
the more efficiently a network spreads a disease (or a piece
of information) virally.
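As a rough illustration of the threshold condition, the following Python sketch estimates $\lambda_1$ by power iteration and checks whether $\lambda_1 < \delta/\beta$. The function names, the example graph (a complete graph on four nodes), and the rate values are illustrative assumptions and are not taken from the cited works. The check only reports whether the sufficient condition for a small infection to die out holds.

```python
def largest_eigenvalue(adj, iterations=500):
    """Estimate lambda_1 of a symmetric, nonnegative matrix by power iteration.

    The iteration runs on A + I so that it also converges on bipartite
    graphs; the shift of 1 is subtracted at the end.
    """
    n = len(adj)
    x = [1.0] * n
    estimate = 1.0
    for _ in range(iterations):
        # y = (A + I) x
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        estimate = max(abs(v) for v in y)  # converges to lambda_1 + 1
        x = [v / estimate for v in y]
    return estimate - 1.0


def threshold_condition_holds(adj, beta, delta):
    """Return True if lambda_1(A) < delta / beta (the condition in the text)."""
    return largest_eigenvalue(adj) < delta / beta


# Hypothetical example: complete graph on 4 nodes, for which lambda_1 = 3.
K4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
lam1 = largest_eigenvalue(K4)
slow_virus = threshold_condition_holds(K4, beta=0.1, delta=0.5)  # tau = 5.0
fast_virus = threshold_condition_holds(K4, beta=0.2, delta=0.5)  # tau = 2.5
```

With these assumed rates, the condition holds for the first virus and fails for the second, since $\lambda_1 = 3$ lies between the two ratios.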

B. Spectral Estimators Based on Random Graphs 

Random network models are currently the primary tool to
study the relationship between local structural properties of a
network and its epidemic threshold. Although many random
networks have been proposed in the literature [15]-[19], only
random networks including a very limited amount of structural
information are currently amenable to analysis. The original
random graph model is the Erdős-Rényi graph, denoted by
$G(n,p)$, in which each edge in a graph with $n$ nodes is
independently chosen with probability $p$ [28]. In this model,
the distribution of degrees in the network follows a Poisson
distribution with expectation $E[d_i] = (n-1)p$. Furthermore,
the largest eigenvalue of its adjacency matrix is almost surely
$\lambda_1 = [1 + o(1)]\, np$ (assuming that $np = \Omega(\log n)$). Although
very interesting from a theoretical point of view, the original
random graph presents very limited modeling capabilities,
since the degree distributions of real-world networks almost
never follow a Poisson distribution.
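The asymptotic statement $\lambda_1 \approx np$ can be checked numerically on a single sample. The sketch below is only an illustration: the graph size, edge probability, seed, and helper names are assumptions chosen for the example, and one finite sample is not a proof of the almost-sure result.

```python
import random


def erdos_renyi(n, p, seed=0):
    """Sample G(n, p) as adjacency lists; each edge is present independently."""
    rng = random.Random(seed)
    neighbors = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return neighbors


def largest_eigenvalue(neighbors, iterations=200):
    """Power iteration on A + I (shifted to avoid oscillation), then unshift."""
    n = len(neighbors)
    x = [1.0] * n
    estimate = 1.0
    for _ in range(iterations):
        y = [x[i] + sum(x[j] for j in neighbors[i]) for i in range(n)]
        estimate = max(y)  # converges to lambda_1 + 1
        x = [v / estimate for v in y]
    return estimate - 1.0


n, p = 200, 0.2  # hypothetical parameters, so that n * p = 40
graph = erdos_renyi(n, p)
lam1 = largest_eigenvalue(graph)
mean_degree = sum(len(nb) for nb in graph) / n
relative_gap = abs(lam1 - n * p) / (n * p)
```

For any graph, $\lambda_1$ is at least the mean degree, which gives a quick sanity check on the estimate.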
In order to increase the modeling abilities of random graphs,
Chung et al. proposed in [16] a random graph $G(w)$ in
which one can prescribe a desired expected sequence of
degrees, $w = (w_1, \ldots, w_n)$. In this random graph, edges are
independently assigned to each pair of vertices $(i,j)$ with
probability $w_i w_j / \sum_{k=1}^{n} w_k$. Chung et al. proved in [16] that
if $\sum_{i=1}^{n} w_i^2 / \sum_{j=1}^{n} w_j > \sqrt{\max_i \{w_i\}} \log n$, then the largest
eigenvalue $\lambda_1(G(w))$ converges almost surely to

$$ \lambda_1(G(w)) \overset{a.s.}{\to} [1 + o(1)] \frac{\sum_{i=1}^{n} w_i^2}{\sum_{j=1}^{n} w_j}, \tag{2} $$

for large $n$. Despite their theoretical interest, random graphs
with a given degree distribution are far from sufficient to
faithfully model the structure of real complex networks. In
particular, it is well-known that the degree distribution alone
is not a sufficient statistic to analyze the performance of many
networks. For example, Alderson et al. introduce in [21] a
collection of networks, including random graphs, presenting
the same degree distribution and radically different dynamical
performance.
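The leading term in (2) depends only on the expected degree sequence, so it is cheap to evaluate. The sketch below computes the ratio $\sum_i w_i^2 / \sum_j w_j$ and checks the stated condition; the helper names and the degree sequence are hypothetical and chosen only for illustration.

```python
import math


def second_order_average_degree(w):
    """Return sum(w_i^2) / sum(w_i), the leading term in (2)."""
    return sum(x * x for x in w) / sum(w)


def chung_condition_holds(w):
    """Check sum(w_i^2) / sum(w_i) > sqrt(max w_i) * log(n), as stated above."""
    return second_order_average_degree(w) > math.sqrt(max(w)) * math.log(len(w))


# Hypothetical expected-degree sequence: 900 nodes of degree 80, 100 of degree 100.
# Here max(w)^2 <= sum(w), so every edge probability w_i * w_j / sum(w) is at most 1.
w = [80.0] * 900 + [100.0] * 100
d_tilde = second_order_average_degree(w)  # estimate of lambda_1 for large n
condition = chung_condition_holds(w)
```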

Although random graph models with more elaborate structural
properties can be found in the literature [15]-[19], these
models are usually hard (if not impossible) to analyze from
a spectral point of view. The source of this intractability is 
the presence of strong correlations among the entries of the 
(random) adjacency matrix. These strong correlations prevent 
the resulting random adjacency matrix from being analytically 
tractable. In the next section, we present a novel approach to 
analyze the effect of local structural properties on the largest 
eigenvalue of a network without making use of random graphs. 

III. Spectral Analysis from Egocentric 
Subnetworks 

In this section, we study the relationship between local
structural properties of a network and its eigenvalue spectrum.
In our analysis, we assume that we do not have access to the
complete topology of the network, due to, for example, privacy
and/or security constraints. Instead, we assume that we are able
to access local egocentric views of the network topology. In
this setting, we propose an approach to extract global spectral
information from local structural properties of the network.
This spectral information will be used in Subsection III-B to
compute upper and lower bounds on the epidemic threshold.

We now provide graph-theoretical and algebraic elements
to characterize the information contained in these egocentric
views of the network. Let $\delta(i,j)$ denote the distance between
two nodes $i$ and $j$ (i.e., the minimum length of a walk from
$i$ to $j$). By convention, we assume that $\delta(i,i) = 0$. We
define the $r$-th order neighborhood around node $i$, denoted
by $\mathcal{G}_{i,r} = (\mathcal{N}_{i,r}, \mathcal{E}_{i,r})$, as the subgraph $\mathcal{G}_{i,r} \subseteq \mathcal{G}$ with node-set
$\mathcal{N}_{i,r} = \{j \in \mathcal{V} : \delta(i,j) \leq r\}$, and edge-set
$\mathcal{E}_{i,r} = \{(v,w) \in \mathcal{E} \text{ s.t. } v, w \in \mathcal{N}_{i,r}\}$. Notice that $\mathcal{G}_{i,r}$ provides a graph-theoretical
description of the egocentric view of the network from node $i$
within a radius of $r$ hops. Motivated by this interpretation, we
also call $\mathcal{G}_{i,r}$ the egonet of radius $r$ around node $i$. Egonets can
be algebraically represented via submatrices of the adjacency
matrix $A_{\mathcal{G}}$, as follows. Given a set of $k$ nodes $\mathcal{K} \subseteq \mathcal{V}$, we
denote by $A_{\mathcal{G}}(\mathcal{K})$ the $k \times k$ submatrix of $A_{\mathcal{G}}$ formed by selecting
the rows and columns of $A_{\mathcal{G}}$ indexed by $\mathcal{K}$. In particular, we
define the adjacency submatrix $A_{i,r} = A_{\mathcal{G}}(\mathcal{N}_{i,r})$. Notice that
$A_{i,r}$ is itself an adjacency matrix representing the structure of
the egonet $\mathcal{G}_{i,r}$. By convention, we associate the first row and
column of the submatrix $A_{i,r}$ with node $i \in \mathcal{V}$, which can
be done via a simple permutation of the rows and columns of
$A_{i,r}$.^2 For a weighted graph $\mathcal{H}$ with weighted adjacency matrix
$A_{\mathcal{H}}$, we define the weighted egonet $\mathcal{H}_{i,r}$ as the weighted graph
whose adjacency matrix is $A_{i,r} = A_{\mathcal{H}}(\mathcal{N}_{i,r})$.
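The definition of an egonet translates directly into a breadth-first search followed by a submatrix extraction. The following is a minimal sketch under the assumption that the whole adjacency matrix is available as a list of lists, which is only for demonstration since the setting of this paper assumes it is not. The function names and the example path graph are hypothetical.

```python
from collections import deque


def egonet_nodes(neighbors, i, r):
    """Nodes within distance r of node i (breadth-first search); node i comes first."""
    dist = {i: 0}
    order = [i]
    queue = deque([i])
    while queue:
        v = queue.popleft()
        if dist[v] == r:
            continue  # do not expand beyond radius r
        for u in neighbors[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                order.append(u)
                queue.append(u)
    return order


def egonet_adjacency(adj, i, r):
    """Return the node list and the submatrix A_{i,r}, with node i in row/column 0."""
    n = len(adj)
    neighbors = [[j for j in range(n) if adj[v][j] != 0] for v in range(n)]
    nodes = egonet_nodes(neighbors, i, r)
    sub = [[adj[u][v] for v in nodes] for u in nodes]
    return nodes, sub


# Hypothetical example: path graph 0-1-2-3-4, egonet of radius 1 around node 2.
n = 5
A = [[1 if abs(u - v) == 1 else 0 for v in range(n)] for u in range(n)]
nodes, A_2_1 = egonet_adjacency(A, 2, 1)
```

Placing the ego in the first row and column matches the convention stated above, so that entry $(1,1)$ of powers of $A_{i,r}$ refers to node $i$.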

A. Spectral Moments from Local Egonets 

In this subsection, we derive expressions for the spectral 
moments of the adjacency matrix from the knowledge of local egonets
using tools from algebraic graph theory. The following lemma 
provides an interesting connection between the number of 
closed walks in Q (a combinatorial property) and its spectral 
moments (an algebraic property) [29]: 



Lemma 3.1: Let $\mathcal{G}$ be a simple graph with adjacency matrix
$A_{\mathcal{G}}$. Then

$$ [A_{\mathcal{G}}^k]_{ii} = |W_{i,k}|, $$

where $W_{i,k}$ is the set of closed walks of length $k$ starting and
finishing at node $i$.

Using the above result, one can prove the following well-known
result in algebraic graph theory [29]:

Corollary 3.2: Let $\mathcal{G}$ be a simple graph. Denote by $e$ and
$\Delta$ the number of edges and triangles in $\mathcal{G}$, respectively. Then,

$$ m_1(A_{\mathcal{G}}) = 0, \quad m_2(A_{\mathcal{G}}) = \frac{2e}{n}, \quad \text{and} \quad m_3(A_{\mathcal{G}}) = \frac{6\Delta}{n}. $$
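The identities $m_1 = 0$, $m_2 = 2e/n$ and $m_3 = 6\Delta/n$ can be verified on a small graph by comparing the trace of powers of the adjacency matrix with direct counts of edges and triangles. The sketch below uses plain Python lists; the helper names and the example graph are illustrative assumptions.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def spectral_moment(adj, k):
    """k-th spectral moment, computed as trace(A^k) / n."""
    n = len(adj)
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # A^0
    for _ in range(k):
        power = mat_mul(power, adj)
    return sum(power[i][i] for i in range(n)) / n


def count_edges_and_triangles(adj):
    """Count undirected edges and triangles by direct enumeration."""
    n = len(adj)
    edges = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    triangles = sum(
        adj[i][j] * adj[j][t] * adj[i][t]
        for i in range(n) for j in range(i + 1, n) for t in range(j + 1, n)
    )
    return edges, triangles


# Hypothetical example: triangle 0-1-2 with a pendant node 3 attached to node 2.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
n = len(A)
e, tri = count_edges_and_triangles(A)
moments = [spectral_moment(A, k) for k in (1, 2, 3)]
```

Here the graph has 4 edges and 1 triangle, so the three moments equal 0, 2 and 1.5.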



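We can generalize Lemma 3.1 to weighted graphs as follows:

Lemma 3.3: Let $\mathcal{H} = (\mathcal{V}, \mathcal{E}, \mathcal{W})$ be a weighted graph with
weighted adjacency matrix $A_{\mathcal{H}}$. Then,

$$ [A_{\mathcal{H}}^k]_{ii} = \sum_{p \in P_{k,i}} \omega(p), $$

where $P_{k,i}$ is the set of closed walks of length $k$ from $v_i$ to
itself in $\mathcal{H}$.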

Proof: By recursively applying the multiplication rule for
matrices, we have the following expansion

$$ [A_{\mathcal{H}}^k]_{ii} = \sum_{i_2=1}^{n} \sum_{i_3=1}^{n} \cdots \sum_{i_k=1}^{n} w_{i i_2} w_{i_2 i_3} \cdots w_{i_k i}. \tag{3} $$

Using the graph-theoretic nomenclature introduced in Section II,
we have that $w_{i i_2} w_{i_2 i_3} \cdots w_{i_k i} = \omega(p)$, for
$p = (v_i, v_{i_2}, v_{i_3}, \ldots, v_{i_k}, v_i)$. Hence, the summations in (3) can be
written as $[A_{\mathcal{H}}^k]_{ii} = \sum_{1 \leq i_2, \ldots, i_k \leq n} \omega(p)$. Finally, the set
of closed walks $p = (v_i, v_{i_2}, v_{i_3}, \ldots, v_{i_k}, v_i)$ with indices
$1 \leq i, i_2, \ldots, i_k \leq n$ is equal to the set of closed walks of
length $k$ from $v_i$ to itself in $\mathcal{H}$ (which we have denoted by
$P_{k,i}$ in the statement of the Lemma). ■

2 Notice that permuting the rows and columns of the adjacency matrix does 
not change the topology of the underlying graph, it simply changes the labels 
associated to each node. 







Fig. 1. Cycles $C_4$ and $C_5$, of lengths 4 and 5, in a neighborhood of radius
2 around node $i$.



Using Lemma 3.3 we can extend Corollary 3.2 to higher-order
moments of weighted graphs as follows:

Theorem 3.4: Consider a weighted, undirected graph $\mathcal{H}$
with adjacency matrix $A_{\mathcal{H}}$. Let $A_{i,r}$ be the (weighted)
adjacency matrix of the egonet of radius $r$ around node $i$. Then,
for a given $r$, the spectral moments of $A_{\mathcal{H}}$ can be written as

$$ m_k(A_{\mathcal{H}}) = \frac{1}{n} \sum_{i=1}^{n} [A_{i,r}^k]_{11}, \tag{4} $$

for $k \leq 2r + 1$.

Proof: Since the trace of a matrix is the sum of its
eigenvalues, we can expand the $k$-th spectral moment of the
adjacency matrix as follows:

$$ m_k(A_{\mathcal{H}}) = \frac{1}{n} \mathrm{Trace}(A_{\mathcal{H}}^k) = \frac{1}{n} \sum_{i=1}^{n} [A_{\mathcal{H}}^k]_{ii}. \tag{5} $$

From Lemma 3.3, we have that $[A_{\mathcal{H}}^k]_{ii} = \sum_{p \in P_{k,i}} \omega(p)$.
Notice that for a fixed value of $k$, closed walks of length $k$
in $\mathcal{H}$ starting at node $i$ can only touch nodes within a certain
distance $r(k)$ of $i$, where $r(k)$ is a function of $k$. In particular,
for $k$ even (resp. odd), a closed walk of length $k$ starting at
node $i$ can only touch nodes at most $k/2$ (resp. $\lfloor k/2 \rfloor$) hops
away from $i$ (see Fig. 1). Therefore, closed walks of length
$k$ starting at $i$ are always contained within the neighborhood
of radius $\lfloor k/2 \rfloor$. In other words, the egonet of radius $r$
contains all closed walks of length up to $2r + 1$ starting at
node $i$. We can count these walks by applying Lemma 3.3 to
the local adjacency matrix $A_{i,r}$. In particular,
$\sum_{p \in P_{k,i}} \omega(p)$
is equal to $[A_{i,r}^k]_{11}$ (since, by convention, node 1 in the local
egonet $\mathcal{H}_{i,r}$ corresponds to node $i$ in the graph $\mathcal{H}$). Therefore,
for $k \leq 2r + 1$, we have that

$$ [A_{\mathcal{H}}^k]_{ii} = \sum_{p \in P_{k,i}} \omega(p) = [A_{i,r}^k]_{11}. \tag{6} $$

Then, substituting (6) into (5), we obtain the statement of our
Theorem. ■

Remark 3.1: The above theorem allows us to
compute a truncated sequence of spectral moments
$\{m_k(A_{\mathcal{H}}), k \leq 2r + 1\}$, given a collection of local
egonets of radius $r$, $\{\mathcal{H}_{i,r}, i \in \mathcal{V}\}$. According to (4), we can
compute the $k$-th spectral moment by simply averaging the
quantities $[A_{i,r}^k]_{11}$, $i = 1, \ldots, n$. For a fixed $k$, each value
$[A_{i,r}^k]_{11}$, $i = 1, \ldots, n$, can be computed in time $O(|\mathcal{N}_{i,r}|^3)$,
where $|\mathcal{N}_{i,r}|$ is the number of nodes in the local egonet
$\mathcal{H}_{i,r}$. The sparse structure of most real networks implies
that $|\mathcal{N}_{i,r}| \ll n$ (for moderate values of $r$). In particular,
if $|\mathcal{N}_{i,r}| = o(n^{\epsilon})$ for any $\epsilon > 0$, we can compute the $k$-th
spectral moments in quasi-linear time (with respect to the
size of the network) using (4). This result provides a clear
computational advantage compared to computing the spectral
moments via an explicit eigenvalue decomposition, which
can be prohibitively expensive to compute for large complex
networks.

B. SDP-Based Bounds on the Spectral Radius


Using Theorem 3.4 we can compute a truncated
sequence of the spectral moments of a network $\mathcal{H}$,
$(m_1(A_{\mathcal{H}}), m_2(A_{\mathcal{H}}), \ldots, m_{2r+1}(A_{\mathcal{H}}))$, from a set of local
egonets of radius $r$, $\{\mathcal{H}_{i,r}, i \in \mathcal{V}\}$. We now present a convex
optimization framework to extract information about the
largest eigenvalue of the adjacency matrix, $\lambda_1(A_{\mathcal{H}})$, from this
sequence of moments. We can state the problem solved in this
subsection as follows:

Problem 1: Given a truncated sequence of spectral moments
of a weighted, undirected graph $\mathcal{H}$, $\mathbf{m}_{2r+1} =
(m_0, m_1, \ldots, m_{2r+1})$, find tight upper and lower bounds on
the largest eigenvalue $\lambda_1(A_{\mathcal{H}})$.

Our approach is based on a probabilistic interpretation of 
the eigenvalue spectrum of a given network. To present our 
approach, we first need to introduce some concepts: 

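Definition 3.1: Given a weighted, undirected graph $\mathcal{H}$ with
(real) eigenvalues $\lambda_1, \ldots, \lambda_n$, the spectral density of $\mathcal{H}$ is
defined as

$$ \mu_{\mathcal{H}}(x) = \frac{1}{n} \sum_{i=1}^{n} \delta(x - \lambda_i), \tag{7} $$

where $\delta(\cdot)$ is the Dirac delta function.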



The spectral density can be interpreted as a discrete probability
density function with support^3 on the set of eigenvalues
$\{\lambda_i, i = 1, \ldots, n\}$. Let us consider a discrete random variable
$X$ whose probability density function is $\mu_{\mathcal{H}}$. The moments of
this random variable satisfy the following:

Lemma 3.5: The moments of a r.v. $X \sim \mu_{\mathcal{H}}$ are equal to
the spectral moments of $A_{\mathcal{H}}$, i.e.,

$$ E_{\mu_{\mathcal{H}}}(X^k) = m_k(A_{\mathcal{H}}) $$

for all $k \geq 0$.

3 Recall that the support of a finite Borel measure $\mu$ on $\mathbb{R}$, denoted by
supp($\mu$), is the smallest closed set $B$ such that $\mu(\mathbb{R} \setminus B) = 0$.

Proof: For all $k \geq 0$, we have the following:

$$ E_{\mu_{\mathcal{H}}}(X^k) = \int x^k \mu_{\mathcal{H}}(x)\, dx = \frac{1}{n} \sum_{i=1}^{n} \int x^k \delta(x - \lambda_i)\, dx = \frac{1}{n} \sum_{i=1}^{n} \lambda_i^k = m_k(A_{\mathcal{H}}). $$

■

We now present a convex optimization framework that allows
us to find bounds on the endpoints of the smallest interval
$[a, b]$ containing the support of a generic random variable
$X \sim \mu$ given a sequence of moments $(M_0, M_1, \ldots, M_{2r+1})$,
where $M_k = \int x^k d\mu$. Subsequently, we shall apply these
results to find bounds on $\lambda_1(A_{\mathcal{H}})$. Our formulation is based
on the following matrices:

Definition 3.2: Given a sequence of moments
$\mathbf{M}_{2r+1} = (M_0, M_1, \ldots, M_{2r+1})$, let $H_{2r}(\mathbf{M}_{2r+1})$ and
$H_{2r+1}(\mathbf{M}_{2r+1}) \in \mathbb{R}^{(r+1) \times (r+1)}$ be the Hankel matrices
defined by^4

$$ [H_{2r}]_{ij} = M_{i+j-2} \quad \text{and} \quad [H_{2r+1}]_{ij} = M_{i+j-1}. \tag{8} $$

The above matrices are called the moment matrices associated
with the sequence $\mathbf{M}_{2r+1}$.

In general, an arbitrary sequence of numbers
$(N_0, N_1, \ldots, N_k)$ may not have a representing measure
$\mu$ such that $\int x^r d\mu = N_r$, for $0 \leq r \leq k$. A sequence of
numbers $\mathbf{N}_k = (N_0, N_1, \ldots, N_k)$ is said to be feasible in
$\Omega \subseteq \mathbb{R}$ if there exists a measure $\mu$ with support contained
in $\Omega$ whose moments match those in the sequence $\mathbf{N}_k$. The
problem of deciding whether or not a sequence of numbers
is feasible in $\Omega$ is called the classical moment problem in
analysis [30]. For univariate distributions, necessary and
sufficient conditions for feasibility can be given in terms
of certain Hankel matrices being positive semidefinite,^5 as
follows [31]:

Theorem 3.6: [31, Theorem 3.2] Let $\mathbf{M}_{2r+1} =
(M_0, M_1, \ldots, M_{2r+1}) \in \mathbb{R}^{2r+2}$. Then,

(a) The sequence $\mathbf{M}_{2r+1}$ corresponds to a sequence of
moments feasible in $\Omega = \mathbb{R}$ if and only if $H_{2r} \succeq 0$.

(b) The sequence $\mathbf{M}_{2r+1}$ is feasible in $\Omega = [a, \infty)$ if
and only if

$H_{2r} \succeq 0$ and $H_{2r+1} - a H_{2r} \succeq 0$.

(c) The sequence $\mathbf{M}_{2r+1}$ is feasible in $\Omega = (-\infty, b]$ if
and only if

$H_{2r} \succeq 0$ and $b H_{2r} - H_{2r+1} \succeq 0$.

4 For simplicity in the notation, we shall omit the argument $\mathbf{M}_{2r+1}$
whenever clear from the context.

5 The notation $A \succeq 0$ means that the matrix $A$ is positive semidefinite.

Using Theorem 3.6 we have the following result:

Theorem 3.7: Let $\mu$ be a probability density function
on $\mathbb{R}$ with associated sequence of moments $\mathbf{M}_{2r+1} =
(M_0, M_1, \ldots, M_{2r+1})$, all finite, and let $[a, b]$ be the smallest
interval which contains the support of $\mu$. Then, $b \geq \beta_r^*(\mathbf{M}_{2r+1})$,
where

$$ \beta_r^*(\mathbf{M}_{2r+1}) := \min_x \; x \quad \text{s.t.} \quad H_{2r} \succeq 0, \quad x H_{2r} - H_{2r+1} \succeq 0. \tag{9} $$

Proof: Since $\mathbf{M}_{2r+1}$ is the moment sequence of a probability
density function $\mu$ with support on $[a, b] \subset (-\infty, b]$, we
have from Theorem 3.6 (c) that $\mathbf{M}_{2r+1}$ satisfies $H_{2r} \succeq 0$ and
$b H_{2r} - H_{2r+1} \succeq 0$. Since $\beta_r^*(\mathbf{M}_{2r+1})$ is, by definition, the
minimum value of $x$ such that $H_{2r} \succeq 0$ and $x H_{2r} - H_{2r+1} \succeq 0$,
we have that $\beta_r^*(\mathbf{M}_{2r+1}) \leq b$. ■

Remark 3.2: Observe that, for a given sequence of moments
$\mathbf{M}_{2r+1}$, the entries of $x H_{2r} - H_{2r+1}$ depend affinely on the
variable $x$. Then $\beta_r^*(\mathbf{M}_{2r+1})$ is the solution to a semidefinite
program^6 (SDP) in one variable. Hence, $\beta_r^*(\mathbf{M}_{2r+1})$ can be
efficiently computed using standard optimization software, e.g.
[33], from a truncated sequence of moments.
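Because the program in (9) has a single scalar variable and its feasible set is a half-line whenever $H_{2r}$ is positive definite, it can also be solved without an SDP solver. The sketch below swaps in plain bisection with a Cholesky-based positive-definiteness test as the feasibility oracle. This is an illustrative substitute, not the software cited in the text. The function names are hypothetical, and the sketch assumes $H_{2r}$ is strictly positive definite, which fails, for instance, when the underlying measure has fewer than $r+1$ distinct atoms.

```python
def hankel_matrices(moments):
    """Build H_2r and H_{2r+1} from (M_0, ..., M_{2r+1}) as in Definition 3.2."""
    r = (len(moments) - 2) // 2
    size = r + 1
    # With 0-based indices, entry (i, j) is M_{i+j} and M_{i+j+1}, respectively.
    h_even = [[moments[i + j] for j in range(size)] for i in range(size)]
    h_odd = [[moments[i + j + 1] for j in range(size)] for i in range(size)]
    return h_even, h_odd


def is_positive_definite(m, tol=1e-12):
    """Attempt a Cholesky factorization; succeed only if every pivot exceeds tol."""
    size = len(m)
    low = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][t] * low[j][t] for t in range(j))
            if i == j:
                if s <= tol:
                    return False
                low[i][i] = s ** 0.5
            else:
                low[i][j] = s / low[j][j]
    return True


def beta_star(moments, iterations=100):
    """Approximate the minimizer of (9) by bisection on the scalar x."""
    h_even, h_odd = hankel_matrices(moments)
    if not is_positive_definite(h_even):
        raise ValueError("this sketch assumes H_2r is positive definite")
    size = len(h_even)

    def feasible(x):
        return is_positive_definite(
            [[x * h_even[i][j] - h_odd[i][j] for j in range(size)]
             for i in range(size)])

    hi = 1.0
    while not feasible(hi):  # grow until feasible
        hi *= 2.0
    lo = -1.0
    while feasible(lo):      # shrink until infeasible
        lo *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            hi = mid
        else:
            lo = mid
    return hi


# Spectral moments of the complete graph K4 (eigenvalues 3, -1, -1, -1), r = 1.
beta_k4 = beta_star([1, 0, 3, 6])
# Spectral moments of the path on 3 nodes (eigenvalues sqrt(2), 0, -sqrt(2)), r = 2.
beta_p3 = beta_star([1, 0, 4 / 3, 0, 8 / 3, 0])
```

In both examples the bound coincides with the largest eigenvalue (3 and $\sqrt{2}$), because the number of distinct eigenvalues equals $r+1$. For general graphs the value is only a lower bound.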



Applying Theorem 3.7 to the spectral density $\mu_{\mathcal{H}}$ of a given
graph $\mathcal{H}$ with spectral moments $(m_0, m_1, \ldots, m_{2r+1})$, we can
find a lower bound on its largest eigenvalue, $\lambda_1(A_{\mathcal{H}})$, as
follows:

Theorem 3.8: Let $\mathcal{H}$ be a weighted, undirected graph with
(real) eigenvalues $\lambda_1 \geq \ldots \geq \lambda_n$. Then, given a truncated
sequence of the spectral moments of $\mathcal{H}$, $\mathbf{m}_{2r+1} =
(m_0, m_1, \ldots, m_{2r+1})$, we have that

$$ \lambda_1(A_{\mathcal{H}}) \geq \beta_r^*(\mathbf{m}_{2r+1}), \tag{10} $$

where $\beta_r^*(\mathbf{m}_{2r+1})$ is the solution to the SDP in (9).

Proof: Let us consider the spectral density of $\mathcal{H}$, $\mu_{\mathcal{H}}$, in
Definition 3.1. According to Lemma 3.5, the density $\mu_{\mathcal{H}}$ has
associated moments $\mathbf{m}_{2r+1}$. Also, the smallest interval which
contains the support of $\mu_{\mathcal{H}}$ is $[a, b] = [\lambda_n, \lambda_1]$. Therefore,
applying Theorem 3.7 to $\mu_{\mathcal{H}}$, we obtain that $\beta_r^*(\mathbf{m}_{2r+1}) \leq
b = \lambda_1$. ■

Furthermore, for $r = 1$, we can analytically solve the SDP
in (9) to derive a closed-form solution for $\beta_1^*(\mathbf{m}_3)$, as follows:

Corollary 3.9: Let $\mathcal{G}$ be a simple graph with adjacency
matrix $A_{\mathcal{G}}$. Denote by $n$, $e$, and $\Delta$ the number of nodes, edges,
and triangles in $\mathcal{G}$, respectively. Then,

$$ \lambda_1(A_{\mathcal{G}}) \geq \frac{3\Delta + \sqrt{9\Delta^2 + 8e^3/n}}{2e}. \tag{11} $$



Proof: In the Appendix. 
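For a quick numerical feel for the closed-form bound, the sketch below evaluates $(3\Delta + \sqrt{9\Delta^2 + 8e^3/n})/(2e)$ from the counts of nodes, edges, and triangles. The function name and the two test graphs are illustrative assumptions; the known eigenvalues of those graphs are used only for comparison.

```python
import math


def lower_bound_r1(n, e, triangles):
    """Closed-form lower bound on lambda_1 from n nodes, e edges and a triangle count.

    Implements (3*D + sqrt(9*D^2 + 8*e^3/n)) / (2*e), with D the number of triangles.
    """
    return (3 * triangles + math.sqrt(9 * triangles ** 2 + 8 * e ** 3 / n)) / (2 * e)


# Complete graph K4: n = 4, e = 6, 4 triangles; its largest eigenvalue is 3.
bound_k4 = lower_bound_r1(4, 6, 4)
# Path on 3 nodes: n = 3, e = 2, no triangles; its largest eigenvalue is sqrt(2).
bound_p3 = lower_bound_r1(3, 2, 0)
```

The bound is attained for the complete graph and is strictly below $\sqrt{2}$ for the path, consistent with it being a lower bound.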



Using the optimization framework presented above, we can 
also compute upper bounds on the spectral radius of H from a 
sequence of its spectral moments, as follows. In this case, our 
formulation is based on the following set of Hankel matrices: 



Definition 3.3: Given a weighted, undirected
graph $\mathcal{H}$ with $n$ nodes and spectral moments
$\mathbf{m}_{2r+1} = (m_0, m_1, \ldots, m_{2r+1})$, let $T_{2r}(y; \mathbf{m}_{2r+1}, n)$
and $T_{2r+1}(y; \mathbf{m}_{2r+1}, n) \in \mathbb{R}^{(r+1) \times (r+1)}$ be the Hankel
matrices defined by^7

$$ [T_{2r}]_{ij} = \frac{n}{n-1} m_{i+j-2} - \frac{1}{n-1} y^{i+j-2}, \qquad [T_{2r+1}]_{ij} = \frac{n}{n-1} m_{i+j-1} - \frac{1}{n-1} y^{i+j-1}. \tag{12} $$

6 A semidefinite program is a convex optimization problem that can be
solved in time polynomial in the input size of the problem; see e.g. [32].

Given a sequence of spectral moments, we can compute
upper bounds on the largest eigenvalue $\lambda_1(A_{\mathcal{H}})$ using the
following result:

Theorem 3.10: Let $\mathcal{H}$ be a weighted, undirected graph
with (real) eigenvalues $\lambda_1 \geq \ldots \geq \lambda_n$. Then, given
a truncated sequence of its spectral moments $\mathbf{m}_{2r+1} =
(m_0, m_1, \ldots, m_{2r+1})$, we have that

$$ \lambda_1 \leq \delta_r^*(\mathbf{m}_{2r+1}, n), $$

where

$$ \delta_r^*(\mathbf{m}_{2r+1}, n) := \max_y \; y \quad \text{s.t.} \quad T_{2r} \succeq 0, \quad y T_{2r} - T_{2r+1} \succeq 0, \quad T_{2r+1} + y T_{2r} \succeq 0. \tag{13} $$



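Proof: Let us define the bulk of the spectrum as the set
of eigenvalues $\{\lambda_2, \ldots, \lambda_n\}$, and the bulk spectral density as
the probability density function:

$$ \tilde{\mu}_{\mathcal{H}}(x) = \frac{1}{n-1} \sum_{i=2}^{n} \delta(x - \lambda_i). $$

We also define the bulk spectral moments as the moments of
the bulk spectral density, which satisfy:

$$ \tilde{m}_k(A_{\mathcal{H}}) = \int x^k \tilde{\mu}_{\mathcal{H}}(x)\, dx = \frac{1}{n-1} \sum_{i=2}^{n} \int x^k \delta(x - \lambda_i)\, dx = \frac{1}{n-1} \sum_{i=2}^{n} \lambda_i^k = \frac{n}{n-1} m_k(A_{\mathcal{H}}) - \frac{1}{n-1} \lambda_1^k. $$

Therefore, the moment matrices associated to the sequence of
bulk spectral moments $\tilde{\mathbf{m}}_{2r+1} = (\tilde{m}_0, \tilde{m}_1, \ldots, \tilde{m}_{2r+1})$ satisfy

$$ H_s(\tilde{\mathbf{m}}_{2r+1}) = T_s(\lambda_1; \mathbf{m}_{2r+1}, n), \tag{14} $$

for $s \in \{2r, 2r+1\}$, where $H_s$ and $T_s$ were defined in (8)
and (12), respectively.

Since $|\lambda_i| \leq \lambda_1$ for $i \geq 2$, the support of the bulk
spectral density is contained in the interval $[-\lambda_1, \lambda_1]$.
Hence, according to Theorem 3.6 (b)-(c), the sequence of
bulk spectral moments $\tilde{\mathbf{m}}_{2r+1}$ must satisfy:

$T_{2r}(\lambda_1; \mathbf{m}_{2r+1}, n) \succeq 0$,

$\lambda_1 T_{2r}(\lambda_1; \mathbf{m}_{2r+1}, n) - T_{2r+1}(\lambda_1; \mathbf{m}_{2r+1}, n) \succeq 0$,

$T_{2r+1}(\lambda_1; \mathbf{m}_{2r+1}, n) + \lambda_1 T_{2r}(\lambda_1; \mathbf{m}_{2r+1}, n) \succeq 0$.

Since $\delta_r^*(\mathbf{m}_{2r+1}, n)$ is, by definition, the maximum value of $y$
satisfying the constraints in (13), we have that $\delta_r^*(\mathbf{m}_{2r+1}, n) \geq
\lambda_1(A_{\mathcal{H}})$. ■

7 We shall omit the arguments from $T_{2r}$ and $T_{2r+1}$ whenever clear from
the context.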

Remark 3.3: The optimization program in (13) is not an
SDP, since the entries of the matrices $T_{2r}(y; \mathbf{m}_{2r+1}, n)$ and
$T_{2r+1}(y; \mathbf{m}_{2r+1}, n)$ are not affine functions, but higher-order
polynomials, in $y$. Nevertheless, the program can be cast into
a convex optimization program, as follows. For the matrices
in (13) to be positive semidefinite, all their principal minors
must be nonnegative, where each minor is a polynomial in
$y$. In other words, positive semidefiniteness of the matrices
in (13) is equivalent to a collection of polynomials in $y$
being nonnegative. Hence, we can substitute the semidefinite
constraints in (13) by a collection of polynomials in $y$ being
nonnegative. The resulting optimization problem is a Sum-Of-Squares
(SOS) program [34], which is a type of convex
program that can be efficiently solved using off-the-shelf
software [35].



In summary, using Theorems 3.4, 3.8 and 3.10, we can
compute upper and lower bounds on the largest eigenvalue of
a weighted, undirected network, $\lambda_1(A_{\mathcal{H}})$, from the set of local
egonets with radius $r$, as follows: (1) Using (4), compute the
truncated sequence of moments $(m_0, m_1, \ldots, m_{2r+1})$ from the
set of egonets, $\{A_{i,r}, i \in \mathcal{V}\}$, and (2) using Theorems 3.8 and
3.10 compute the upper and lower bounds, $\delta_r^*(\mathbf{m}_{2r+1}, n)$ and
$\beta_r^*(\mathbf{m}_{2r+1})$, respectively.
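Step (1) of this procedure can be sketched in a few lines of Python. For each node, the code builds the radius-$r$ egonet by breadth-first search, reads off entry $(1,1)$ of the $k$-th power of the local adjacency matrix, and averages over nodes as in (4). The result is compared against the trace of powers of the full matrix, which is available here only because the example graph is tiny. All function names and the weighted example graph are assumptions made for illustration.

```python
from collections import deque


def egonet_adjacency(adj, i, r):
    """Adjacency submatrix of the radius-r egonet of node i (node i is index 0)."""
    n = len(adj)
    dist = {i: 0}
    nodes = [i]
    queue = deque([i])
    while queue:
        v = queue.popleft()
        if dist[v] == r:
            continue  # do not expand beyond radius r
        for u in range(n):
            if adj[v][u] != 0 and u not in dist:
                dist[u] = dist[v] + 1
                nodes.append(u)
                queue.append(u)
    return [[adj[u][v] for v in nodes] for u in nodes]


def closed_walk_weight(local_adj, k):
    """Entry (1,1) of the k-th power of the local matrix, via k matrix-vector products."""
    size = len(local_adj)
    x = [0.0] * size
    x[0] = 1.0
    for _ in range(k):
        x = [sum(local_adj[a][b] * x[b] for b in range(size)) for a in range(size)]
    return x[0]


def moments_from_egonets(adj, r):
    """Spectral moments m_0, ..., m_{2r+1} computed from local egonets, as in (4)."""
    n = len(adj)
    egonets = [egonet_adjacency(adj, i, r) for i in range(n)]
    return [sum(closed_walk_weight(a, k) for a in egonets) / n
            for k in range(2 * r + 2)]


def moments_from_trace(adj, kmax):
    """Reference values m_k = trace(A^k) / n computed from the full matrix."""
    n = len(adj)
    power = [[1.0 if a == b else 0.0 for b in range(n)] for a in range(n)]
    moments = []
    for _ in range(kmax + 1):
        moments.append(sum(power[a][a] for a in range(n)) / n)
        power = [[sum(power[a][t] * adj[t][b] for t in range(n)) for b in range(n)]
                 for a in range(n)]
    return moments


# Hypothetical weighted graph: a 7-cycle with unit weights plus a chord (0, 2) of weight 2.
n = 7
A = [[0.0] * n for _ in range(n)]
for u in range(n):
    v = (u + 1) % n
    A[u][v] = A[v][u] = 1.0
A[0][2] = A[2][0] = 2.0

local_moments = moments_from_egonets(A, r=2)     # m_0, ..., m_5 from egonets
global_moments = moments_from_trace(A, kmax=5)   # same moments from the full matrix
```

The two sequences agree for all $k \leq 2r + 1 = 5$. The resulting moments can then be passed to whatever routine is used for step (2).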



IV. Numerical Simulations 

In this section, we analyze real data from several social and
communication networks to numerically verify the tightness
of our bounds. In our first set of simulations, we study a
regional network of Facebook that spans 63,731 users (nodes)
connected by 817,090 friendships (edges) [36]. In order to
corroborate our results in different network topologies, we
extract multiple medium-size social subgraphs by running a
Breadth-First Search (BFS) around a collection of starting nodes
in the Facebook graph. Each BFS induces a social subgraph
spanning all nodes within 2 hops of a starting node. As a
result, we generate a set of 100 different social subgraphs,
$\mathbf{G} = \{G_i\}_{i \leq 100}$, centered around 100 randomly chosen nodes.
For each social subgraph $G_i \in \mathbf{G}$, we compute its first
five spectral moments $\mathbf{m}_5(G_i) = (m_1(G_i), \ldots, m_5(G_i))$ and
use Theorems 3.8 and 3.10 to compute lower and upper
bounds on the spectral radius, $\beta_2^*(G_i) := \beta_2^*(\mathbf{m}_5(G_i))$ and
$\delta_2^*(G_i) := \delta_2^*(\mathbf{m}_5(G_i), n_i)$, where $n_i$ is the size of $G_i$. Since
we have access to the complete network topology, we can also
numerically compute the exact value of the largest eigenvalue
$\lambda_1(G_i)$, for comparison purposes. It is worth remarking that,
in many real applications, we do not have access to the
complete network topology, due to privacy and/or security
constraints; therefore, we would not be able to compute the
exact value of $\lambda_1$. It is in those cases that our approach is
most useful.

Fig. 2 represents a scatter plot where each red circle above
the dashed diagonal line has coordinates $(\lambda_1(G_i), \delta_2^*(G_i))$,
and each blue circle below the dashed diagonal line has
coordinates $(\lambda_1(G_i), \beta_2^*(G_i))$, for all $G_i \in \mathbf{G}$. We have
also included a black line connecting every pair of circles
associated to the same subgraph $G_i$. This black line represents
the interval of possible values in which the largest eigenvalue,
$\lambda_1(G_i)$, must lie. (Notice how the dashed diagonal line cuts
through all those segments.) For all the social subnetworks
in $\mathbf{G}$, the spectral radii $\lambda_1(G_i)$ are remarkably close to the
theoretical bounds $\beta_2^*(G_i)$ and $\delta_2^*(G_i)$. In other words, in
our collection of social subgraphs, local structural properties
of the network strongly constrain the location of the largest
eigenvalue, and consequently the ability of a social network
to disseminate information virally.

Our bounds are also tight for other important social and
communication networks. In the following, we compare
the values of $\beta_2^*$ and $\delta_2^*$ with the largest eigenvalue $\lambda_1$ of an
e-mail and an Internet network:

Example 4.1 (Enron e-mail network): In this example we
consider a subgraph of the Enron e-mail communication
network [37]. Nodes of the network are e-mail addresses and
the network contains an edge $(i,j)$ if $i$ sent at least one e-mail
to $j$ (or vice versa). The total size of the network is 36,692
nodes, which is too large for us to manage computationally. In
order to compare our bounds with the exact value of the largest
eigenvalue, we analyze a subgraph obtained by a BFS of depth
2 around a randomly chosen node. The resulting subgraph has
$n = 3{,}215$ nodes and $e = 36{,}537$ edges. We also compute
the value of its largest eigenvalue to be $\lambda_1 = 95.18$. Using
Q, we have the following values for the first five spectral 
moments of the adjacency matrix: $m_1 = 0$, $m_2 = 22.47$,
$m_3 = 394.7$, $m_4 = 33{,}491$, and $m_5 = 2{,}603{,}200$. From (9)
and (13), we obtain the following lower and upper bounds
on the largest eigenvalue: $\beta_2^* = 78.53 < \lambda_1 < 98.74 = \delta_2^*$.
Notice that the numerical value of $\lambda_1$ is remarkably close to the
upper bound $\delta_2^*$. Since the spectral radius measures the ability
of a network to spread information virally, our numerical
results indicate that the e-mail network spreads information
very efficiently given the structural constraints imposed by
the local egonets. We can also compare our bounds with the 
estimator in Q, corresponding to a random network with the 
same degree distribution. The value of the estimator is equal 
to 124.57, which falls outside our interval and is looser than our bounds.
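
To make the comparison with degree-based estimators concrete, the sketch below computes one common estimate of this kind: the ratio of the second to the first moment of the degree sequence. This is an assumption made for illustration only. The text does not spell out which random-network estimator it uses, and the function name `degree_moment_estimator` is hypothetical.

```python
def degree_moment_estimator(degrees):
    """Ratio sum(d^2) / sum(d) of a degree sequence.

    ASSUMPTION: used here as a generic degree-based estimate of the
    spectral radius of a random graph with the given degrees; it is not
    necessarily the estimator referred to in the text.
    """
    degrees = list(degrees)
    first = sum(degrees)
    second = sum(d * d for d in degrees)
    if first == 0:
        raise ValueError("the graph has no edges")
    return second / first


# A star with three leaves: degrees (3, 1, 1, 1).
print(degree_moment_estimator([3, 1, 1, 1]))
```

For the star with three leaves the ratio is 2.0, while the exact spectral radius of that graph is $\sqrt{3} \approx 1.73$. Even on this tiny example, the estimate uses only the degrees and offers no guarantee of bracketing $\lambda_1$.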

Fig. 2. Scatter plot of the spectral radius, $\lambda_1(G_i)$, versus the lower bound
$\beta_2^*(G_i)$ (blue circles) and the upper bound $\delta_2^*(G_i)$ (red circles), where each
point is associated with one of the 100 social subgraphs considered in our
experiments.

Example 4.2 (AS-Skitter Internet network): In this example,
we consider a subgraph of the Internet network at the
Autonomous Systems (AS) level. The network topology was
obtained from the Skitter data collection in CAIDA [38]. Our
subgraph was obtained from the complete AS graph using
a BFS of depth 2 around a random node. The resulting
subgraph has $n = 2{,}248$ nodes, $e = 20{,}648$ edges, and its
largest eigenvalue is $\lambda_1 = 91.3$. The spectral moments of its
adjacency matrix are $m_1 = 0$, $m_2 = 18.37$, $m_3 = 341.1$,
$m_4 = 40{,}001$, and $m_5 = 2{,}777{,}018$. The resulting bounds
from (9) and (13) are $\beta_2^* = 74.72 < \lambda_1 < 93.94 = \delta_2^*$. Notice
how the largest eigenvalue is again remarkably close to the
upper bound, indicating that the network is able to spread
information efficiently, given its local structural constraints. In
this case, the estimator based on random networks produces
a value of 219.1, more than twice the exact value. Therefore,
using random networks to analyze spreading processes in the
Internet graph can be misleading.
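
The figures reported in Examples 4.1 and 4.2 can be compared with a few lines of arithmetic. The sketch below is illustrative only: `relative_position` is a hypothetical helper, and the numbers are copied from the two examples. It reports where the exact eigenvalue sits inside each interval (0 at the lower bound, 1 at the upper bound), how far the random-network estimate overshoots, and checks the second moment of the AS-Skitter subgraph against the identity $m_2 = 2e/n$.

```python
def relative_position(lower, value, upper):
    """Position of `value` inside [lower, upper]: 0 at `lower`, 1 at `upper`."""
    if not lower <= value <= upper:
        raise ValueError("value lies outside the interval")
    return (value - lower) / (upper - lower)


# (lower bound, exact lambda_1, upper bound, random-network estimate),
# as reported in Examples 4.1 and 4.2.
reported = {
    "Enron": (78.53, 95.18, 98.74, 124.57),
    "AS-Skitter": (74.72, 91.3, 93.94, 219.1),
}

for name, (lo, lam, hi, est) in reported.items():
    pos = relative_position(lo, lam, hi)
    overshoot = est / lam - 1.0  # relative error of the random-network estimate
    print(f"{name}: position in interval = {pos:.2f}, overshoot = {overshoot:.0%}")

# Second spectral moment of the AS-Skitter subgraph from m_2 = 2e/n.
m2_skitter = 2 * 20648 / 2248
print(f"AS-Skitter m_2 = {m2_skitter:.2f}")
```

With these inputs the exact eigenvalue sits at roughly 0.82 and 0.86 of the way from the lower to the upper bound, the random-network estimates overshoot by about 31% and 140%, and the AS-Skitter second moment evaluates to 18.37, matching the value reported in Example 4.2.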

In conclusion, our numerical results validate the quality of 
the lower and upper bounds, $\beta_2^*$ and $\delta_2^*$, on the spectral radius
$\lambda_1$ in several social and communication networks. Our bounds
provide an interval of values in which the largest eigenvalue is 
guaranteed to lie. This is in contrast with estimators based on 
random networks, which can be very misleading and present 
no quality guarantees. 

V. Conclusions 

A fundamental question in the field of mathematical epi- 
demiology is to understand the relationship between a net- 
work's structural properties and its epidemic threshold. For 
many virus epidemic models, the role of the network topology 
is characterized by the largest eigenvalue of its adjacency 
matrix: the larger the eigenvalue, the more efficiently
a network spreads a disease (or a piece of information)
virally. In many cases of practical interest, it is not possible 
to retrieve the complete structure of a network of contacts due 
to privacy and/or security constraints. Thus, it is not possible
to exactly compute the largest eigenvalue of the network. On 
the other hand, it is usually easy to retrieve local views of 
a network, also called egonets, by extracting the structure 
of neighborhoods around a collection of chosen nodes. To 
estimate the value of the spectral radius when only egonets 
are available, researchers usually use random network models 
in which they prescribe local structural features that can be 
extracted from the egonets, such as the degree distribution. 
This approach, although very common in practice, presents a 
major flaw: Random network models implicitly induce many 
structural properties that are not directly controlled and can be
relevant to the spreading dynamics. 

In this paper, we have presented an alternative mathemat- 
ical framework, based on algebraic graph theory and convex 
optimization, to study how egonets constrain the interval of 
possible values in which the largest eigenvalue (and, therefore, 
the epidemic threshold) must lie. Our approach provides an 
interval of values in which the largest eigenvalue is guaranteed 
to lie and is applicable to weighted networks. This is in 
contrast with estimators based on random networks, which 
can be very misleading and present no quality guarantees. Our 
numerical simulations have shown that the resulting interval in 
which the largest eigenvalue must lie is very narrow for several 
social and communication networks. This indicates that, for an 
important collection of networks, the viral epidemic threshold 
is strongly constrained by local structural properties of the 
network. 

Appendix 

Corollary 3.9: Let $\mathcal{G}$ be a simple graph with adjacency
matrix $A_{\mathcal{G}}$. Denote by $n$, $e$, and $\Delta$ the number of nodes, edges,
and triangles in $\mathcal{G}$, respectively. Then,

$$\lambda_1(A_{\mathcal{G}}) \geq \frac{3\Delta + \sqrt{9\Delta^2 + 8e^3/n}}{2e}.$$

Proof: From Corollary 3.2 we have that the first three
moments of $\mathcal{G}$ are $m_1(A_{\mathcal{G}}) = 0$, $m_2(A_{\mathcal{G}}) = 2e/n$, and
$m_3(A_{\mathcal{G}}) = 6\Delta/n$ (by definition, $m_0(A_{\mathcal{G}}) = 1$). Substituting
the sequence of moments, $\mathbf{m}_3 = (1, 0, 2e/n, 6\Delta/n)$, into (9),
we have that $\beta_1^*(\mathbf{m}_3)$ is the solution to the following SDP:

$$\min_{x} \; x \quad \text{s.t.} \quad R(x) = \begin{bmatrix} x & -2e/n \\ -2e/n & 2ex/n - 6\Delta/n \end{bmatrix} \succeq 0.$$

The characteristic polynomial of $R(x)$ can be written as
$\phi(s; x) = \det(sI - R(x)) = s^2 - s\,\mathrm{tr}(R(x)) + \det(R(x))$.
Then, $R(x) \succeq 0$ if and only if both roots of $\phi(s; x)$ are
nonnegative. By Descartes' rule, this happens if and only if
the following two conditions are satisfied:

(1) $\mathrm{tr}(R(x)) = x(1 + 2e/n) - 6\Delta/n \geq 0$, which implies

$$x \geq \frac{6\Delta}{2e + n} \triangleq x_1. \tag{15}$$

(2) $\det(R(x)) = 2ex^2/n - 6\Delta x/n - 4e^2/n^2 \geq 0$, which implies

$$x \geq \frac{3\Delta + \sqrt{9\Delta^2 + 8e^3/n}}{2e} \triangleq x_2. \tag{16}$$

We also have that

$$x_2 \geq \frac{3\Delta + \sqrt{9\Delta^2}}{2e} = \frac{3\Delta}{e} \geq \frac{6\Delta}{2e + n} = x_1.$$

Therefore, the minimum value of $x$ satisfying (15) and (16)
is equal to $x_2$, the right-hand side of the inequality in the statement of the corollary.
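
The closed-form bound of the corollary is easy to check numerically on small graphs. The following sketch is an illustration under stated assumptions rather than part of the proof: the helper names (`count_triangles`, `corollary_lower_bound`, `spectral_radius`) are hypothetical, the graph is assumed connected and stored as a dictionary of neighbor sets, and the exact eigenvalue is approximated by a plain power iteration on $A + I$.

```python
import math


def count_triangles(adj):
    """Number of triangles; each one is seen 6 times over ordered adjacent pairs."""
    return sum(len(adj[u] & adj[v]) for u in adj for v in adj[u]) // 6


def corollary_lower_bound(n, e, triangles):
    """Lower bound on lambda_1 from node, edge and triangle counts."""
    return (3 * triangles + math.sqrt(9 * triangles**2 + 8 * e**3 / n)) / (2 * e)


def spectral_radius(adj, iters=200):
    """Power iteration on A + I (the shift avoids oscillation on bipartite graphs)."""
    x = {u: 1.0 for u in adj}
    lam = 0.0
    for _ in range(iters):
        y = {u: x[u] + sum(x[v] for v in adj[u]) for u in adj}  # y = (A + I) x
        lam = sum(x[u] * y[u] for u in adj) / sum(x[u] ** 2 for u in adj)
        norm = math.sqrt(sum(val * val for val in y.values()))
        x = {u: val / norm for u, val in y.items()}
    return lam - 1.0  # undo the shift


# Complete graph on four nodes: n = 4, e = 6, four triangles.
k4 = {i: {j for j in range(4) if j != i} for i in range(4)}
n = len(k4)
e = sum(len(nbrs) for nbrs in k4.values()) // 2
bound = corollary_lower_bound(n, e, count_triangles(k4))
print(bound, spectral_radius(k4))
```

For the complete graph on four nodes the bound evaluates to 3, which coincides with its spectral radius, so the inequality is attained there. For a triangle-free graph the expression reduces to $\sqrt{2e/n}$.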